Abstract

In the look-down/shoot-down scenario, the next generation
of air-to-surface missiles will rely on IR sensors and
advanced signal processing to detect small (or point)
targets in highly cluttered and noisy environments. In
this paper, we present a novel wavelet detection algorithm
which incorporates adaptive CFAR detection statistics
using the bootstrap method. Following detection, the
estimate of interframe optical flow is made using
synthetic discriminant filters (SDF's). The detection
coupled with the new optical flow estimate will enable
higher performance in tracking small maneuverable
targets. Results for the wavelet bootstrap detection are
presented and compared to a conventional matched filter.

1.  Introduction 

Detection of small targets in the look-down/shoot-down
situation is becoming increasingly difficult as target
signatures become more noise-like and engagements take
place in cluttered environments where the clutter is
structured or extended. Such cluttered, low signal to
noise ratio (SNR) environments push the limits of
conventional detection and tracking algorithms. This
scenario represents a challenging and unsolved problem.
Modern staring focal plane arrays (FPA's), with their
superior acquisition ranges and countermeasure resistance,
can potentially be used to solve the look-down/shoot-down
problem if two problems can be overcome. First, the
structured clutter present in these situations is not
fully described by the simple (Gaussian) models commonly
used in matched filtering [1], which results in lower
detection rates and more false alarms. The second problem
is the presence of fixed pattern noise in the IR sensor
itself. Fixed pattern noise results from the nonuniform
response of the detectors to a uniform incoming signal.
Wavelet based image processing enables one to overcome
these two problems.

This research has quantified the utility of wavelet-based
image-processing algorithms by addressing the point
target-detection problem in fixed pattern noise and in
structured clutter. This research follows and parallels
the standard detection paradigm that has been successfully
used in radar detector design. Our research has proceeded
by first solving the detection problem when only array
noise is present by characterizing the detector array
nonuniformity response. Next, we formulated an optimum
detection algorithm while recognizing the array noise
limits by defining an adaptive constant false alarm rate
(CFAR) point target-detection algorithm that maximizes
the detection signal-to-clutter ratio. The effectiveness
of these detection algorithms is evaluated by using
dynamically controlled point targets embedded in a
selected set of measured IR backgrounds. The point targets
are modeled using a simulated Gaussian target with
parametrically varying amplitude, size, and polarity
embedded in both fixed pattern noise and in scene-based
video images.

Wavelets have properties that suit them to
look-down/shoot-down antiair and air-to-surface problems.
For example, they are scalable: image features at one
scale can be effectively rejected, while other features
(like a small target) can be searched out preferentially.
Wavelets can extract spatial information (e.g., edges) and
can “whiten” the fixed pattern 1/f-type noise present on
InSb and HgCdTe staring arrays [2-5]. The number of
additions and multiplications needed to compute the
fastest (Daubechies) wavelet transform is directly
proportional to the number of pixels in an array, and the
transform can be implemented on commercial off-the-shelf
(COTS) digital signal-processing hardware.

The rest of the paper is organized as follows. The first
section gives a brief overview of wavelets and highlights
the reasons that wavelets are beneficial in this case. The
section is written to give a general description of
wavelet theory, and the reader is encouraged to check the
references [6-8] for more details. The second section
describes the wavelet CFAR detection algorithm. The
algorithm employs bootstrap statistical processing, and
this topic is discussed at length in the second section.
Also in this section, one of the problems discussed in the
introduction, namely fixed pattern noise, is covered. In
addition to the wavelet CFAR detector, the synthetic
discriminant filters as developed by Mahalanobis et al.
[9-13] are discussed, along with how they might be used
for optical flow estimates. The third section presents the
overview of integrated detection and tracking. Finally, we
present some results and conclusions.

2.  Wavelet  Overview

The development of wavelets is fairly recent, most of the
basic theory having been developed in the past 10 or more
years with major contributions from researchers in France
and the United States. Virtually any function, subject to
some simple constraints, can be a wavelet. Since the
formalism of the wavelet transform was first introduced by
Grossman and Morlet [14], many excellent books
[6-8,15-20], including the classic SIAM monograph by
Daubechies, have appeared. With this wealth of material
available, here we will only give a brief overview of
wavelet analysis and how wavelets can be brought to bear
on the detection problem.

Wavelet transforms translate and dilate a suitably chosen
mother wavelet, which decomposes the signal into its local
multiscale resolution (coarse to fine). Although wavelets
are at least as fundamental as Fourier analysis, they are
more flexible and provide information unavailable from the
Fourier transform. Like Fourier analysis, wavelets can be
interpreted as a basis set in some normed function space.
Wavelets are compactly supported (non-zero on some finite
interval), which provides good localization properties.
Localized processing means targets and clutter retain
their proper locations before and after processing, which
is extremely important for detection. The research of
Carmona [21] and Mallat [22] suggests that transients will
be localized in certain frequency subbands of the wavelet
transform.

An example of a simple wavelet is the Haar wavelet, which
has compact support, is quite simple, and is defined as
follows:

$$\psi(x) = \begin{cases} 1, & 0 \le x < 1/2 \\ -1, & 1/2 \le x < 1 \\ 0, & \text{otherwise} \end{cases}$$

and the companion scaling function is

$$\phi(x) = \begin{cases} 1, & 0 \le x < 1 \\ 0, & \text{otherwise.} \end{cases}$$

Figure 1. Examples of Wavelets.


The panels in Figure 1 illustrate the Mallat, Haar, and
Daubechies wavelets, respectively. Although Haar wavelets
are simple and compact in the time domain, they are not
well localized in Fourier space, where they decay very
slowly as $\omega^{-1}$ because of the sharp
discontinuity. Figure 2 is an example of the standard
wavelet transform in a packet architecture. The labels L
(scaling coefficients) and H (wavelet coefficients)
suggest their role as low- and high-pass filters,
respectively. As an example, for the Haar wavelet L is an
averaging filter and H is a differencing filter. Their
elements are

$$L = \{1/\sqrt{2},\ 1/\sqrt{2}\} \quad \text{and} \quad H = \{1/\sqrt{2},\ -1/\sqrt{2}\}.$$

The wavelet packet transform is a generalized version of
the wavelet transform (Reference v). Repeating the wavelet
transform by using the output of either or both the low
(L)- and high (H)-pass filters as new inputs creates a
variable bandwidth filter that is multiresolutional. The
input signal is decomposed into low- and high-frequency
bands by convolving the signal with the filter and then
subsampling (e.g., downsampling by 2).
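
The filter-then-downsample step described above can be sketched in a few lines. This is a minimal illustration, assuming an even-length signal and the Haar taps quoted in the text; the function names (`analyze`, `haar_step`, `haar_inverse`) and the test signal are hypothetical and are not taken from the paper's implementation.

```python
import math

# Haar analysis filters from the text: L averages, H differences.
L = [1 / math.sqrt(2), 1 / math.sqrt(2)]
H = [1 / math.sqrt(2), -1 / math.sqrt(2)]

def analyze(signal, taps):
    """Apply a 2-tap filter and keep every second output (downsample by 2)."""
    return [taps[0] * signal[i] + taps[1] * signal[i + 1]
            for i in range(0, len(signal) - 1, 2)]

def haar_step(signal):
    """One transform level: returns (scaling, wavelet) coefficients."""
    return analyze(signal, L), analyze(signal, H)

def haar_inverse(low, high):
    """Invert one level; possible because the Haar transform is orthogonal."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / math.sqrt(2))
        out.append((a - d) / math.sqrt(2))
    return out

# Flat background with one bright sample (a stand-in for a point target).
signal = [4.0, 4.0, 4.0, 10.0, 4.0, 4.0, 4.0, 4.0]
low, high = haar_step(signal)
```

In this toy case the flat background produces zero wavelet (H) coefficients, and only the pair containing the bright sample produces a nonzero one, which illustrates why a small target tends to stand out in the high-pass band. Feeding `low` back into `haar_step` gives the next, coarser level.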


Figure 2. Wavelet Packet Architecture.


The two-dimensional transform for images is easily
constructed from a tensor product of the one-dimensional
wavelet filters in the appropriate order as horizontal and
vertical convolutions. As the image is decomposed by these
filters, their frequency range divides the image into
frequency subbands. Wavelet analysis offers a
time-frequency or space-frequency tradeoff that captures
low-frequency signals with great frequency accuracy and
high-frequency signals with great temporal accuracy. This
is a very reasonable tradeoff, since we cannot
simultaneously have both time and frequency accuracy
because of the Heisenberg uncertainty principle. The
dimension of the wavelet subimages is directly related to
the original image. When the maximally decimated
transforms are downsampled by two, each dimension of the
image is halved at each node (i.e., each subimage has 1/4
as many pixels as its input). Thus, the transformed image
will not require any more memory than the original image.
Maximally decimated means that the remaining sample after
decimation obeys the Nyquist sampling rate.

Figure 3. Wavelet Transform
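
As a rough sketch of the separable construction described above, the following applies a one-level Haar split along rows and then along columns. It assumes a small image with even dimensions stored as a list of rows; the function names and the subband labels `ss`, `sd`, `ds`, `dd` are illustrative choices, not the paper's notation.

```python
import math

R2 = math.sqrt(2)

def split(v):
    """One Haar level on a 1-D list: (low half, high half)."""
    low = [(v[i] + v[i + 1]) / R2 for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) / R2 for i in range(0, len(v), 2)]
    return low, high

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar2d(image):
    """Return the four subimages (ss, sd, ds, dd) of one decomposition level."""
    # Horizontal pass: filter and downsample each row.
    row_low, row_high = zip(*(split(row) for row in image))
    subbands = []
    for half in (row_low, row_high):
        # Vertical pass: filter and downsample each column.
        col_low, col_high = zip(*(split(col) for col in transpose(half)))
        subbands.append(transpose(col_low))
        subbands.append(transpose(col_high))
    return tuple(subbands)

# 4 by 4 flat background with a single raised pixel.
image = [[2.0] * 4 for _ in range(4)]
image[1][2] = 6.0
ss, sd, ds, dd = haar2d(image)
```

Each of the four outputs is 2 by 2, so the total number of coefficients equals the number of input pixels, which matches the memory remark in the text. Applying `haar2d` again to `ss` would give the next level of the pyramid.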

Figure 3 illustrates a wavelet decomposition of an image
into unequal subbands. The ordering of the tree branches
is shown in the pyramid structure. The image is split by
iterating the four-band decomposition three times. The
lowest-frequency components are located in the upper left
corner.

In Figure 4 a 512 by 512 image and its wavelet transform
are displayed. Smooth regions of the original image will
appear in the low-frequency subimage, and sharp edges will
appear in the high-frequency subimages. The high pass
filters are labeled by d and the low pass filters are
labeled by s. For example, the d1-d1 256 by 256 subimage
in the upper right corner represents the vertical and
horizontal convolutions using the H filter. It has mostly
high-frequency features. The 256 by 256 s1-s1 subimage,
which has mostly low-frequency features, represents the
vertical and horizontal convolutions with the L filter. It
was used as the new input image for the next application
of the wavelet transform. The s3-s3 128 by 128 subimage in
the lower left corner is the output of the third
application of the vertical and horizontal convolutions
with the L filter. The white patches are the
multiresolution version of the original vehicles; some
detail in the subimages is washed out because of the
graphical transcription into this document format.

To summarize, wavelets are suitably defined functions that
can be used to decompose signals into a time-frequency or
space-frequency space. The ability of a few wavelet
coefficients to efficiently represent an image within the
reduced dimensional subimages means that the computational
burdens associated with pattern-recognition algorithms can
be accommodated. In addition to being computationally
efficient and being perfectly invertible, wavelet
transforms have a variety of other properties which make
them well suited to the task of point target detection.
These properties are described in the next section.

3.  Bootstrap,  Wavelets  and  CFAR 
Detectors 

Detection of pulses in the presence of interference or
noise requires some type of statistical detection
strategy. The wavelet-based detection strategy used in
this report is based on the research of Carmona [21]. The
functional diagram of our wavelet-based constant false
alarm rate (CFAR) detector is shown in the schematic in
Figure 5.

A CFAR detector is necessary to control the false alarm
rate as the interference and noise vary. Moreover, any
properly designed CFAR detector must constantly monitor
and adaptively estimate the CFAR threshold, because a
well-known way to defeat a CFAR detector is to raise the
noise power gradually [23]. To do this, a CFAR processor
must adaptively estimate the probability of false alarm.
Thus, any useful CFAR detector must develop an adaptive
test statistic for the false alarm probability

$$P_{fa} = \Pr\{\text{decide } s(t) \text{ is present} \mid s(t) \text{ is absent}\}$$

and the probability of detection

$$P_{d} = \Pr\{\text{decide } s(t) \text{ is present} \mid s(t) \text{ is present}\}.$$

The original image is in the upper left hand panel, while
its wavelet transform is in the upper right hand panel.
After the wavelet transform the pixel values in the
subimage are lexicographically ordered by converting the
subimage to an n x 1 vector. Modeling the density
functions that are directly dependent on the d-scale of
the wavelet transform, while perhaps not hopeless, is
nontrivial [24]. One fundamental result is known: namely,
when the input noise is Gaussian, the density functions of
the d-scales are also Gaussian [22] and the variance is
decreased by $2^{j}$ as the scale index integer $j$ is
decremented. For this reason the bootstrap technique seems
ideally suited to the task of adaptively setting CFAR in
complex scenes with non-Gaussian background noise.

The lower left hand panel represents the bootstrap option.
A basic principle of the bootstrap method as enunciated by
Efron, who named it and demonstrated its scope, is that it
represents “the substitution of computational power for
theoretical analysis. The payoff, of course, is freedom
from the constraints of traditional parametric theory,
with its over reliance on a small set of standard models
for which theoretical solution are available” [25]. Thus,
the bootstrap relieves us from having to make parametric
assumptions about the underlying background noise.
However, it does have its own modeling requirements, such
as the degree of correlation present in the wavelet
coefficients and the required number of training sets for
valid inference. According to Hall, the insight of Efron
was to realize that in complex situations, when bootstrap
statistics are awkward to compute, they may be
approximated by Monte Carlo “resampling” [26]. The name
“bootstrap” refers to the use of the original data set to
generate new data sets by sampling with replacement. A
schematic of the basic bootstrap resampling process is
shown in Figure 6.

Figure 4. Original Image and Wavelet Subimages.

Figure 5. Wavelet-Based CFAR Detector.

Given a sample x of size n, random samples of size n are
drawn with replacement B times from the original sample,
and the value of a statistic is then computed for each
sample. Although there is not a complete and rigorous
theory for our extension of the bootstrap to these
problems, we take our inspiration from the following
guidelines found in the paper by Efron and Tibshirani
[27], namely, (1) the bootstrap algorithm can be applied
to almost any statistical estimation problem or data
structure, and (2) “the statistic t(x) can be anything at
all, as long as we can compute t(x*) for every bootstrap
data set x*.”

Figure 6. Basic Bootstrap Resampling Process.

The computational overhead associated with the bootstrap
is directly proportional to B times the number of
operations in the computation of T(d*). For a sample of
size n, the number of operations is proportional to
$B \cdot O(n)$ for a maximally decimated Daubechies
wavelet transform and to $B \cdot O(n \log n)$ for an
undecimated transform.
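
The resampling loop described above can be sketched as follows. The paper does not spell out the statistic used to set the threshold, so this sketch assumes, purely for illustration, that the statistic is an upper empirical quantile of the coefficients and that the threshold is the mean of its bootstrap replicates. The synthetic heavy-tailed data, the function names, and the parameter values are all hypothetical.

```python
import random

def empirical_quantile(values, q):
    """Order statistic at fraction q of the sorted values (assumed statistic)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]

def bootstrap_replicates(sample, statistic, B, rng):
    """Draw B resamples of size n with replacement; evaluate the statistic on each."""
    n = len(sample)
    return [statistic([rng.choice(sample) for _ in range(n)]) for _ in range(B)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stand-in for wavelet coefficients of a target-free subimage:
# Gaussian values with occasional large outliers (synthetic, non-Gaussian).
coeffs = [rng.gauss(0.0, 1.0) * (3.0 if rng.random() < 0.05 else 1.0)
          for _ in range(512)]

pfa = 0.01  # desired false alarm probability (illustrative)
reps = bootstrap_replicates(
    coeffs, lambda d: empirical_quantile(d, 1.0 - pfa), B=200, rng=rng)
threshold = sum(reps) / len(reps)
```

The cost is B evaluations of the statistic on n values each, in line with the proportionality noted above. The spread of `reps` also gives a rough indication of how stable the threshold estimate is for the chosen B.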

3.1  MATCHED  FILTERING 

It is well known that the optimal detector for known
signals in additive white Gaussian noise is the matched
filter, which is normally implemented by correlating the
received signal, r(t), with the known signal, s(t). In
discrete time we compute

$$g(r) = r^{T} s$$

where r and s are vectors containing samples of the
received signal and known signal. These statistics are
compared to a threshold to decide whether the signal s was
present during the observation interval. When the arrival
time of the signal is unknown, one typically correlates at
each possible time shift. The output of the filter
exceeding a threshold indicates the presence of the
signal, and the location of the peak output indicates the
location of the signal.

For non-white Gaussian noise the optimal discrete-time
detector uses essentially the matched filter modified by
multiplication by the inverse of the noise covariance
matrix:

$$g(r) = r^{T} \Sigma^{-1} s$$

Another well-known result in detection theory is that, for
known signals in additive Gaussian noise, one can expand
the known signal and the received signal in another
orthonormal basis and perform the detection process in
that coordinate system without degrading the overall
performance. Daubechies' wavelets are an orthogonal
transform. When the known signal is corrupted by
non-Gaussian noise, there is usually no simple derivation
for the optimal detector. In many cases the optimal
detector is unknown, which explains the existence of
numerous competing suboptimal approaches.
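
For the white-noise case, correlating at each possible shift and thresholding the peak can be sketched as below. This is a one-dimensional toy with a hypothetical template, received vector, and threshold; it only illustrates the inner product r^T s evaluated over shifts, not the paper's full detector.

```python
def correlate(r, s):
    """Inner product r^T s at every shift where the template fits inside r."""
    m = len(s)
    return [sum(r[k + i] * s[i] for i in range(m))
            for k in range(len(r) - m + 1)]

def detect(r, s, threshold):
    """Return (peak shift, peak value) if the peak exceeds the threshold, else None."""
    g = correlate(r, s)
    k = max(range(len(g)), key=g.__getitem__)
    return (k, g[k]) if g[k] > threshold else None

template = [1.0, 2.0, 1.0]                              # known signal s (hypothetical)
received = [0.1, -0.2, 0.0, 1.1, 2.0, 0.9, 0.1, -0.1]   # received samples r (hypothetical)
hit = detect(received, template, threshold=3.0)
```

Here the filter output peaks at shift 3, where the template lines up with the embedded pulse. For non-white noise the same loop would be applied after weighting by the inverse noise covariance, as in the second expression above.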

To evaluate the performance of the wavelet bootstrap
detector it is necessary to embed and compare it within
the matched filter context. Figure 7 is a block diagram
illustrating the details of matched filter detection. The
original derivation is by Chen and Reed [28], with helpful
discussions in [29].

Before we explain the concepts associated with the main
blocks, some of the terms should be defined. An image chip
is a sub-block of pixels on the array; often it will be a
3 by 3 or 4 by 4 square of pixels. The noise sample is
another image chip, ideally devoid of targets, that is
selected to estimate the covariance of the background
noise/clutter. The template s is a model of the target and
can be quite detailed. For small targets it is often a
simple numerical approximation for the optical point
spread function. Using CFAR detectors is equivalent to
testing the hypothesis $H_0$ (no target present, only
clutter/noise), denoted by $\xi$, versus the alternative
hypothesis $H_1$ (target and clutter/noise present),
denoted by $s + \xi$, by using a simple threshold. The
clutter covariance is denoted by $\Sigma$, and its inverse
multiplied by the transpose of the lexicographically
ordered template is the matched filter. If the clutter
distribution is Gaussian and the filter is linear, then
the filtered clutter is also Gaussian. Assuming a Gaussian
clutter model, the Gaussian log likelihood ratio test [28]
is designed to make the optimum distinction between the
two hypotheses. The threshold is set by noting that these
assumptions are equivalent to testing a mean shift modeled
by the template s in the multivariate normal distributions

$$H_1:\ x = s + \xi, \quad x \sim N(s, \Sigma)$$

$$H_0:\ x = \xi, \quad x \sim N(0, \Sigma)$$

Thus, the false alarm rate can be estimated using the
multivariate normal with zero mean and sample covariance
$\Sigma$. Setting the CFAR rate is a critical step for
infrared images, because if the background clutter is not
well approximated by the Gaussian hypothesis, higher false
alarm rates will result.

Figure 7. Matched Filter Detection.
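
Under the Gaussian model, the threshold that fixes the false alarm probability can be written in closed form. The short derivation below is a sketch that assumes the covariance $\Sigma$ is known and identical under both hypotheses, writes the filter output as $g = x^{T}\Sigma^{-1}s$, and uses $\Phi$ for the standard normal distribution function; the symbols $d$ and $\tau$ are introduced here for illustration only.

```latex
% Filter output statistics under each hypothesis, with d^2 = s^T \Sigma^{-1} s
\begin{aligned}
H_0:&\quad g \sim N(0,\ d^{2}), \qquad
H_1:\quad g \sim N(d^{2},\ d^{2}), \qquad d^{2} = s^{T}\Sigma^{-1}s \\[4pt]
P_{fa} &= \Pr\{g > \tau \mid H_0\} = 1 - \Phi\!\left(\frac{\tau}{d}\right)
\;\;\Longrightarrow\;\; \tau = d\,\Phi^{-1}(1 - P_{fa}) \\[4pt]
P_{d} &= \Pr\{g > \tau \mid H_1\}
       = 1 - \Phi\!\left(\frac{\tau - d^{2}}{d}\right)
       = 1 - \Phi\!\left(\Phi^{-1}(1 - P_{fa}) - d\right)
\end{aligned}
```

The variance follows from $\mathrm{Var}(g) = s^{T}\Sigma^{-1}\Sigma\,\Sigma^{-1}s = d^{2}$. When the clutter tails are heavier than Gaussian, the true false alarm probability at this $\tau$ exceeds the design value, which is the failure mode noted above.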

3.2  Spatial  Clutter  Statistics 

If the Gaussian assumption is violated, outliers will be
present and manifest themselves in the tail of the
distribution. Outliers are “rogue observations” that lie,
in some sense, far from the middle of the data and bias
the false alarm estimate. A quick and effective means with
which to visualize the tail behavior in empirical data is
provided by the quantile-quantile plot (Q-Q plot). It can
be used to compare the degree of agreement between two
empirical distributions, or it can be used to compare the
empirical quantiles with the quantiles from an ideal
distribution. Two distributions are compared by graphing
quantiles of one distribution against the corresponding
quantiles of the other. A normal Q-Q plot is a plot of the
ordered data $y_i$ from the sample
$\{x_0, \ldots, x_{N-1}\}$ versus
$y_{p_i} = \Phi^{-1}(p_i)$, where $p_i = (i - 1/2)/N$,
$i = 1, \ldots, N$, and $\Phi^{-1}$ is the inverse of the
standard normal distribution [30]. If the shape of the
unknown distribution is approximately normal, even in the
tails, then the empirical quantile sample values will
approximate the normal line. A normal Q-Q plot will reveal
the presence of outliers in the extreme tails or
leptokurtic shape. Both are strong indicators that robust
methods are required. The shape is leptokurtic, as defined
in Cleveland [31], if the relative density of the data in
the middle of the distribution compared with the density
in the tails is thinner than the normal distribution.
Figure 8 displays a Q-Q plot using a 64 by 64 subimage
taken from a Skyball image. These samples clearly exhibit
non-normal behavior in the tails. Moreover, the subimage
dimensions are typical of the subimage tiling that is
often used to localize the scene statistics.
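
The coordinates of a normal Q-Q plot can be computed directly from the definition. The sketch below assumes plotting positions p_i = (i - 1/2)/N for i = 1, ..., N and uses the Python standard library's `statistics.NormalDist` for the inverse normal distribution; the pixel values are made up for illustration.

```python
from statistics import NormalDist

def normal_qq_pairs(sample):
    """Pair each ordered value y_i with the normal quantile at p_i = (i - 1/2)/N."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [(std_normal.inv_cdf((i - 0.5) / n), y)
            for i, y in enumerate(ordered, start=1)]

# Hypothetical pixel values containing one large outlier.
pixels = [0.3, -1.2, 0.1, 4.5, -0.4, 0.8, -0.1, 0.2, -0.6, 0.5]
pairs = normal_qq_pairs(pixels)
```

Plotting the second element of each pair against the first gives the Q-Q plot. For approximately normal data the points fall near a straight line, while an outlier such as the 4.5 above shows up as a point well off that line at the upper end.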

3.2.1  Fixed  pattern  noise

In a series of seminal papers, Flandrin [32] and Wornell
[33] proved that the Daubechies wavelets will decorrelate
or “whiten” a broad class of stochastic processes that
have 1/f-type spectra. Earlier, Scribner and others [2-4]
observed that a major contributor to the fixed pattern
nonuniformity observed on infrared staring arrays is the
1/f-type spectra associated with the sampled output of the
individual pixels. Using measurements from an Amber 128 by
128 staring array, Hewer and Kuo [5] combined these two
results to demonstrate that the Daubechies wavelets do
“whiten” the 1/f-type spectra as predicted. These
transforms effectively decorrelate the fixed pattern noise
inherent on indium antimonide (InSb) and mercury cadmium
telluride (HgCdTe) staring arrays.

Figure 8. Q-Q plot of data from a Skyball image
(horizontal axis: quantiles of standard normal). Note the
deviation from the straight line in the tails, indicating
non-normal behavior.

3.3  Optimal  trade-off  synthetic  discriminant  filters  (SDF’s)

As an adjunct to the matched filtering, a class of
correlation filters as developed by Mahalanobis et al.
[9-13] has been implemented. These filters have
exceptional tolerance to scaling and rotation distortions.
The tolerance of the filters is incorporated through the
selection of an appropriate training set, and can be tuned
to provide high (generalization) or low (specificity)
tolerance.

In the discussion of the MACH (maximum average correlation
height) filters that follows, bold lowercase indicates a
column vector, while bold uppercase represents a diagonal
matrix. The filters result from maximizing the ratio

$$J(h) = \frac{|h^{+} m|^{2}}{h^{+} S h} \qquad (1)$$

where h is the correlation filter and m is the average of
the training images in the Fourier domain. Each image is
lexicographically ordered to form a vector. S is the
average similarity measure matrix

$$S = \sum_{k=1}^{N} (X_k - M)(X_k - M)^{+} \qquad (2).$$


In eq. (2) $X_k$ are the individual training images, again
in the Fourier domain. The training image is
lexicographically ordered and its elements placed on the
diagonal of $X_k$. M is the mean training image, arranged
similarly to $X_k$. All of the processing to generate the
filters is performed in the Fourier domain, to gain
translational invariance. It is possible to perform the
processing in other domains (e.g., wavelet or spatial),
but care must be taken to properly register the training
imagery.

The optimal filter h is then given by

$$h = S^{-1} m \qquad (3).$$
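
Because S is diagonal, applying its inverse reduces to an elementwise division, which the following sketch makes explicit. It assumes the training images have already been transformed to the Fourier domain and flattened into complex vectors; the tiny three-element vectors, the function name, and the small regularizer `eps` are hypothetical additions for illustration, and S is taken as the plain sum over training images.

```python
def mach_filter(training, eps=1e-12):
    """Build h = S^{-1} m from equal-length complex vectors (Fourier-domain images)."""
    n = len(training)
    d = len(training[0])
    # Mean training vector m.
    m = [sum(x[j] for x in training) / n for j in range(d)]
    # Diagonal of S: sum over k of |X_k - M|^2 at each frequency.
    s_diag = [sum(abs(x[j] - m[j]) ** 2 for x in training) for j in range(d)]
    # S is diagonal, so the inverse is an elementwise division.
    # eps is an assumed regularizer that avoids division by zero.
    return [m[j] / (s_diag[j] + eps) for j in range(d)]

# Two hypothetical training vectors, already in the Fourier domain.
training = [
    [1.0 + 0.0j, 2.0 + 1.0j, 0.0 + 0.5j],
    [1.2 + 0.0j, 1.8 + 1.0j, 0.0 - 0.5j],
]
h = mach_filter(training)
```

Frequencies where the training images agree (small diagonal entry of S) receive large filter weights, while frequencies where they disagree are suppressed. In the example the third component, whose mean is zero, receives zero weight.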

Variants on the MACH filter can be achieved by varying the
performance metric one wishes to maximize. For example,
Refregier [13] has developed optimal trade-off synthetic
discriminant filters (OTSDF's), which attempt to minimize
the energy functional

$$E(h) = h^{+} Q h - \delta\,|h^{+} m| \qquad (4)$$

where

$$Q = \alpha P + \beta D + \gamma S \qquad (5).$$

S is as defined previously, while P is the power spectral
density of the expected noise, and D is the average power
spectral density of the training set. The constants
$\alpha$, $\beta$, $\gamma$, $\delta$ are non-negative and
must satisfy
$\alpha^{2} + \beta^{2} + \gamma^{2} + \delta^{2} = k$,
where k is any positive constant. Minimizing E(h) results
in

$$h = \frac{\delta}{2}\, Q^{-1} m \qquad (6).$$


By varying the parameters, one can adapt the filter for
the optimal performance for the situation under study. If
one sets $\alpha = \beta = 0$, the result is the MACH
filter discussed earlier. Further variations can be made
to the basic idea, including the extension to multiple
class discrimination using distance classifier correlation
filters (DCCF's), which are able to distinguish between
multiple classes of similar objects (e.g., T72's vs. M1A1
tanks).
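
The trade-off among the noise, training-set, and similarity terms can be explored numerically. The sketch below assumes that P, D, and S are all diagonal and stored as lists of their diagonal entries, and it evaluates h as delta/2 times the elementwise quotient m/Q; every numeric value and the function name are hypothetical.

```python
def otsdf_filter(m, P, D, S, alpha, beta, gamma, delta):
    """h = (delta/2) Q^{-1} m with Q = alpha*P + beta*D + gamma*S, all diagonal.

    m is a complex vector; P, D, S hold the (real, non-negative) diagonals.
    """
    q = [alpha * p + beta * d + gamma * s for p, d, s in zip(P, D, S)]
    return [0.5 * delta * mj / qj for mj, qj in zip(m, q)]

m = [1.0 + 0.0j, 2.0 - 1.0j]   # mean training vector (hypothetical)
P = [1.0, 1.0]                 # white-noise power spectral density (hypothetical)
D = [1.5, 5.5]                 # average training-set power spectrum (hypothetical)
S = [0.5, 0.25]                # diagonal of the similarity matrix (hypothetical)

# alpha = beta = 0 leaves only the similarity term, i.e. a MACH-type filter.
h_mach_like = otsdf_filter(m, P, D, S, alpha=0.0, beta=0.0, gamma=1.0, delta=2.0)
# Mixing in the noise term trades distortion tolerance for noise robustness.
h_blend = otsdf_filter(m, P, D, S, alpha=0.5, beta=0.0, gamma=0.5, delta=2.0)
```

With alpha and beta set to zero the result is proportional to the MACH solution, consistent with the remark in the text. Raising alpha flattens the filter toward a noise-weighted matched filter.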

The class of OTSDF filters was chosen for the feature
detection for several reasons. As discussed, the filters
can incorporate varying degrees of distortion tolerance
and be built to generalize classes of targets. Another
benefit of the algorithm is that the result is
statistically optimum, and depends on a realistic,
mathematically rigorous optimization procedure, as opposed
to other heuristic methods. A final consideration is the
computational efficiency. The MACH filters require no
segmentation or edge detection preprocessing, and the
correlation step can be performed rapidly using dedicated
FFT hardware. However, the limit of its performance may be
reached when considering small (< 5x5 pixels) targets.
This limit is to be investigated more thoroughly in future
work.

4.  Integrated  detection  and  tracking 

The functional diagram of our overall integrated detection
and tracking design is shown in the schematic in Figure 9.
The original image is in the upper left hand panel, while
its wavelet transform is in the upper right hand panel.
The two images in the lower left hand panel illustrate
optical flow. Optical-flow fields, which describe image
domain motion extracted from two or more images, must
discriminate between motion and local illumination changes
due to the intrinsic noise field on the array, and the
illumination changes due to the clutter field in the
sensor's field of view. Being able to accurately estimate
the velocity field when sub-pixel and multi-pixel
interframe changes are present is essential and difficult,
especially for dim targets moving relative to the
background. Nobody has a satisfactory solution for optical
flow when dynamic occlusion is present; the estimation of
optical flow is not yet a mature subject. The highlighted
streak in the lower right hand image represents a track
history from a Kalman filter. When detecting low intensity
targets in dense clutter, the allowable false target
confirmation rate is a system requirement. Thus, an
integrated system must include a system tradeoff between
tracking complexity, scan rate, and detection threshold.
The tracking methodology in this research will borrow
heavily from the research of Blackman [34] and Bar-Shalom
[35]. The challenge here is to integrate the
wavelet-bootstrap detector and the SCF for multipixel
targets with their data association algorithms and Kalman
filters for track file maintenance.

The SDF's are used to estimate the interframe motion.
Given a set of potential targets as detected by the
wavelet matched filter, the SDF detection/classification
is performed to yield an accurate assessment of the
position of the target. In the next frame, the SDF
algorithm is performed again, searching in areas which
previously contained targets. From these two frame
results, it is possible to estimate the target motion.
With the motion estimate in hand, it is possible to feed
this data into a Kalman filter and update the filter more
accurately. In some test cases, extremely accurate
tracking performance was demonstrated. The lower limits of
the algorithm, in terms of SNR and target size, are yet to
be determined, and are a basis for future work.
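
The two-frame motion estimate can be illustrated with a very small sketch. It assumes the SDF stage has returned a target position (row, column) in each of two consecutive frames and uses a constant-velocity prediction to place the next search window. The positions, frame period, and function names are hypothetical, and this is not a Kalman filter, only the kind of measurement that would be fed to one.

```python
def interframe_velocity(p_prev, p_curr, dt):
    """Velocity (pixels per second) from target positions in two consecutive frames."""
    return tuple((c - p) / dt for p, c in zip(p_prev, p_curr))

def predict_position(p_curr, velocity, dt):
    """Constant-velocity prediction of where to center the next search window."""
    return tuple(c + v * dt for c, v in zip(p_curr, velocity))

dt = 1.0 / 25.6                  # frame period in seconds (illustrative value)
p_frame1 = (40.0, 62.5)          # hypothetical SDF peak location, frame 1
p_frame2 = (41.5, 62.0)          # hypothetical SDF peak location, frame 2

v = interframe_velocity(p_frame1, p_frame2, dt)
next_center = predict_position(p_frame2, v, dt)
```

A tracking filter would weight such a prediction against the measurement noise of the next detection rather than trusting it outright, but the sketch shows how two frame results yield a usable motion estimate.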


5.  Results 

5.1  Imagery  and  Data 

Before presenting our results, we briefly describe the
data sets that were used. Several databases have been
used in this effort. Blackbody measurements to
characterize fixed pattern noise were obtained during our
early research using an Amber AE 4128 128 by 128 InSb
staring array that operates at mid-wave infrared
wavelengths. Additional blackbody images from a
Honeywell uncooled microbolometer 336 by 165-pixel IR
sensor at long-wave infrared wavelengths were obtained
from D. A. Scribner, J. T. Caulfield, and M. R. Kruer of
the Electro-Optical
Technology Branch at the Naval Research Laboratory
(NRL), Washington, D.C. A bolometer is an infrared
detector that measures absorbed incident infrared
radiation through the voltage change produced when a
nonequilibrium temperature differential alters its
electrical resistance. They also provided Amber 256 by
256 images taken during tests overlooking the
Chesapeake Bay.

Figure 9  Integrated detection and track (blocks: Wavelet
Transform, Motion Estimate)

Additional digital IR image sequences were collected
from an airborne platform using the Skyball two-color IR
seeker. Skyball was a joint, 1.5-year, digital
data-collection effort by the Naval Air Warfare Center
Weapons Division, China Lake, California, and Hughes
Missile Systems. The Skyball sensors consisted of two
128 by 128 HgCdTe FPAs operating simultaneously in
the mid- and long-wave IR spectral bands, mounted in
the nose of a Jetstar-8 aircraft with an in-flight
blackbody calibrator. Frame rates typically varied
from 25.6 to 49.1 frames per second. Unclassified
Skyball image sets include a statistically significant
sample of cluttered backgrounds flown at varying
ranges, geometries, and seasons. Very large format,
extremely high spatial resolution 12-bit digital IR
image sequences were also acquired from the Airborne
Infrared Measurement System program sponsored by
DARPA. A full description of the data can be found in
reference [36].

To quantify algorithm performance, an interactive
approach between algorithm testing and development
was employed by using synthetic targets. Routines were
written to generate subpixel-size to N- by N-size raised
Gaussian targets. Target sizes are presently limited to
subpixel up to 7 by 7 pixels. This range has been
adequate for small target-detection investigations to date.
The method can be readily extended to larger sizes,
should this be required. The software enables the
researcher to insert anti-aliased Gaussian targets at
user-specified pixel locations with user-designated peak
amplitudes. Polarity reversal of contrast is supported so
that targets may fall through the neighboring background
level and rise back up through it. Peak amplitude,
amplitude change, rate of change, size, and location are
specified by the user and can be adjusted to meet
requirements. Care has been taken during the rendering
process to preserve target edge contrast and minimize
aliasing so that any artificial edge enhancement due to
target insertion is minimized.
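
The target insertion described above can be sketched as follows. This is a minimal illustration, not the software used in this work: the function name, its parameters, and the use of sub-sample averaging within each pixel as the anti-aliasing step are all assumptions. The default half_width of 3 gives the 7 by 7 footprint mentioned in the text, and a negative peak gives reversed contrast polarity.

```python
import math


def insert_gaussian_target(image, row, col, peak, sigma,
                           half_width=3, oversample=4):
    """Add a Gaussian target to `image` (a list of lists of floats) in place.

    (row, col) is the target center and may be fractional; pixel centers sit
    at integer coordinates. `peak` may be negative for reversed polarity.
    Each pixel receives the Gaussian averaged over an oversample x oversample
    grid of sub-samples, which approximates integration over the pixel area.
    """
    n_rows, n_cols = len(image), len(image[0])
    r0, c0 = int(round(row)), int(round(col))
    # Sub-sample offsets, symmetric about the pixel center.
    offsets = [(k + 0.5) / oversample - 0.5 for k in range(oversample)]
    for r in range(max(0, r0 - half_width), min(n_rows, r0 + half_width + 1)):
        for c in range(max(0, c0 - half_width), min(n_cols, c0 + half_width + 1)):
            total = 0.0
            for dr in offsets:
                for dc in offsets:
                    d2 = (r + dr - row) ** 2 + (c + dc - col) ** 2
                    total += math.exp(-d2 / (2.0 * sigma ** 2))
            image[r][c] += peak * total / oversample ** 2
    return image
```

A time-varying amplitude, such as a target falling through the background level and rising back up, would be obtained by calling the function once per frame with a different peak.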

5.2  Results

All of the comparative detectors were applied to the
d3-d3 subimage described in Figure 3. For very small
targets occupying a square image chip (i.e., 5 x 5 or
fewer pixels), the matched filter shape is matched to the
blur circle. The three matched filter detectors using a
sampling annulus are: 1) empirical covariance assuming
a Gaussian distribution to set the detection threshold
[1], 2) empirical covariance using a bootstrap method to
set the detection threshold, and 3) autocovariance using
a bootstrap method to set the detection threshold. Two
other detectors were also studied, namely, 4) a matched
filter without a sampling annulus combined with an
autocovariance and bootstrap threshold, and 5) a
bootstrap threshold extracted directly in the d3-d3
subimage. All video sequences were fifty frames in
length and used embedded stationary Gaussian targets.
The CFAR detection statistics were estimated from the
d3-d3 subimage of the first frame and then applied to the
entire sequence. Further details are forthcoming in the
NAWCWPNS Technical Publication [39]. If the sampling
chip in Figure 9 should accidentally include the target,
then a lower-than-expected probability of detection will
result [27]. To avoid this possibility a sampling annulus,
as suggested by Singer and Saski [1], is placed around
the suspected target area and the covariance estimate is
obtained from it.
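
To make the annulus sampling and the bootstrap threshold concrete, a minimal sketch follows. It is not the procedure used in this work: the square annulus geometry, the function names, and the choice to average resampled empirical quantiles are assumptions. The idea it illustrates is that the threshold for a desired false alarm probability is taken from the resampled background values themselves, with no Gaussian model.

```python
import random


def annulus_samples(image, row, col, inner, outer):
    """Pixels in the square annulus inner < max(|dr|, |dc|) <= outer.

    The region within `inner` of (row, col) is excluded so that a target at
    the center does not contaminate the background sample.
    """
    samples = []
    for r in range(max(0, row - outer), min(len(image), row + outer + 1)):
        for c in range(max(0, col - outer), min(len(image[0]), col + outer + 1)):
            if max(abs(r - row), abs(c - col)) > inner:
                samples.append(image[r][c])
    return samples


def empirical_quantile(samples, q):
    """Order statistic of `samples` at fraction q (0 <= q <= 1)."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


def bootstrap_threshold(background, pfa, n_boot=200, seed=0):
    """Bootstrap estimate of the threshold exceeded with probability `pfa`.

    The background values are resampled with replacement n_boot times; the
    (1 - pfa) quantile of each resample is computed and the results averaged.
    """
    rng = random.Random(seed)
    n = len(background)
    estimates = []
    for _ in range(n_boot):
        resample = [background[rng.randrange(n)] for _ in range(n)]
        estimates.append(empirical_quantile(resample, 1.0 - pfa))
    return sum(estimates) / n_boot
```

In a detector, the values passed to bootstrap_threshold would be filter outputs drawn from the annulus or subimage rather than raw pixels.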

The background clutter used in these investigations
included Skyball and Chesapeake Bay IR imagery. The
Skyball sequences selected for the comparative study
represent mountainous terrain and urban (light industrial
and residential) terrain. The mountainous background
data, collected during daylight hours over the Sierra
Nevada, is comprised primarily of forest, snow, and rock.
The mid-wave infrared urban images were collected at
night along U.S. 101 near the Santa Barbara, CA coast.
The sequences included light industrial urban areas with
the highway and freeway interchange visible, small to
large several-story buildings, and residential areas. The
Chesapeake Bay video imagery, comprised of ocean and
sky background, was collected using an onshore
stationary Amber camera.

The standard SNR as given in [38] is defined as

\[
\mathrm{SNR} = 10 \log_{10} \left( \frac{\frac{1}{n} \sum_{j=1}^{n} \mathrm{signal}_j^{2}}{\mathrm{var}(\mathrm{noise})} \right)
\]

where signal is pixel amplitude, j denotes pixel
number, and n is the number of target pixels in the
sampling chip. In these studies a 10 x 10 chip was
used.
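
For reference, the SNR definition can be evaluated as in the sketch below, which reads it as the mean squared target pixel amplitude divided by the noise variance, expressed in dB. Two assumptions are made here that the text does not state: the signal amplitudes are measured relative to the local background, and the noise variance is the population variance of the non-target pixels in the chip.

```python
import math
from statistics import pvariance


def snr_db(signal_pixels, noise_pixels):
    """SNR in dB: mean squared signal amplitude over noise variance.

    signal_pixels: amplitudes of the n target pixels (assumed to be
                   background-subtracted).
    noise_pixels:  background pixel values from the sampling chip.
    """
    mean_square = sum(s * s for s in signal_pixels) / len(signal_pixels)
    return 10.0 * math.log10(mean_square / pvariance(noise_pixels))
```

With this definition a negative SNR means that the mean squared target amplitude is below the background variance.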

The  design  of  a  simple  binary  hypothesis  detector  is 
such  that  any  pixel  value  that  exceeds  the  threshold  will 
be  classified  as  a  target.  If  a  single  detection  threshold  is 
used for the entire infrared array, then any patchy outlier
clutter response or fixed pattern noise will force a
compromise between the false alarm rate and the local
target-to-clutter ratio. Thus, the standard receiver operating
characteristic  or  CFAR  performance  curves  that  use  the 
dichotomous  probability  of  detection  versus  probability 
of  false  alarm  will  require  very  high  target-to-clutter 
ratios.  Such  high  ratios  will  understate  the  performance 
of  any  detector  that  has  a  high  probability  of  detecting 
the  target,  while  simultaneously  admitting  an  acceptable 
number  of  threshold  exceedances.  One  major  function  of 
Kalman  filtering  is  to  provide  data  association  files  to 
manage  threshold  exceedances.  For  this  reason,  the 
number  of  false  positives  is  a  useful  performance 
metric. 
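
The two figures of merit used in the remainder of this section, probability of detection and average number of false positives per frame, can be computed from per-frame threshold exceedances as in the sketch below. The function name and the matching tolerance tol are assumptions; the sketch relies on the target being stationary, as in the sequences described above.

```python
def score_sequence(detections_per_frame, target_pos, tol=1):
    """Return (Pd, mean false positives per frame) for a stationary target.

    detections_per_frame: one list of (row, col) threshold exceedances
                          per frame.
    target_pos:           (row, col) of the embedded target.
    tol:                  exceedances within `tol` pixels of the target in
                          both axes count as a detection (assumed value).
    """
    hits = 0
    false_positives = 0
    for detections in detections_per_frame:
        near = [d for d in detections
                if abs(d[0] - target_pos[0]) <= tol
                and abs(d[1] - target_pos[1]) <= tol]
        if near:
            hits += 1
        false_positives += len(detections) - len(near)
    n_frames = len(detections_per_frame)
    return hits / n_frames, false_positives / n_frames
```

A frame counts as a detection if at least one exceedance falls near the target; every other exceedance in that frame is counted as a false positive.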

In the Chesapeake Bay sequence, all detectors were
able to detect the target with a probability of detection
(Pd) of 100% down to a SNR of -4 dB. However, at the
lowest detected SNR the average number of false
positives per frame over the fifty frames was quite
variable: Detector 1 had 0.52 false positives, Detector 2
had 2.42, Detector 3 had 6.8, Detector 4 had 16.02, and
Detector 5 had 4.02. This data set was unique in that the
dynamic range of the pixel amplitudes in each of these
frames was 6 bits.

In the Skyball long-wavelength IR mountainous terrain
sequence, Detector 1 had a Pd of 6% with 0.33 false
positives per frame at a SNR of 2 dB. It could not
reliably detect targets at lower SNRs. Detector 2 had a
Pd of 6% with 3.33 false positives at a SNR of -5 dB,
while Detector 3 had a Pd of 30% with 1.33 false
positives. Detector 4 exhibited 100% Pd down to a SNR
of -5 dB with 13.42 false positives per frame, and
Detector 5 had a Pd of 6% with 0.67 false positives at
-5 dB SNR.

In the Skyball mid-wavelength IR mountainous terrain
sequence, Detector 1 had a Pd of 4% with 0.5 false
positives at a SNR of -6 dB. It had no further reliable
detections at lower SNRs. At a SNR of -10 dB, Detector 2
had a Pd of 18% with 2.78 false positives, Detector 3
had a Pd of 18% with 5.0 false positives, and Detector 4
had a Pd of 100% with 10.70 false positives per frame.
Detector 5 had a Pd of 2% with 2.0 false positives at a
SNR of -9 dB. It was unable to reliably detect the target
below -9 dB SNR.

In the Skyball mid-wavelength urban-light industrial
sequence, Detectors 1, 2, and 5 could not reliably detect
the target at a SNR of -1 dB. No higher SNR data
were run with this background. Detector 3 had a Pd
of 94% with 1.19 false positives at a SNR of -6 dB,
while Detector 4 had a Pd of 100% with 13.46 false
positives. At a SNR of -13 dB, Detector 3 had a Pd of
50% with 1.0 false positives, while Detector 4 had a Pd
of 52% with 13.77 false positives.

5.3  Conclusions

Wavelet transforms combined with matched filtering
represent a computationally efficient algorithm to
locate small target candidates on an imaging array.
These filters are compatible with CFAR detector
methodology. However, robustly setting the CFAR
threshold is still one of the biggest challenges in
infrared detection, because of the limitations of parametric
models to capture the structured clutter diversity. For
this reason the results are open to further interpretation
and research. While the bootstrap can accommodate
non-Gaussian clutter distributions, its overall
performance will depend on being able to manage the
resulting false alarm rate, which ultimately must be
evaluated within a track file context in order to assess its
final impact.

The use of synthetic discriminant filters to estimate
optical flow of small targets is a new approach. The
limiting factor is the target size. Further work needs to
be done to determine the lower limit on target size. It
is known that reliable results can be achieved with
targets of 5 x 5 pixels; results for smaller targets are
forthcoming. For midcourse corrections and the end
game scenario, the SDF's are able to improve the
accuracy of the tracking, and are also able to discriminate
between types of targets.